ABSTRACT

The use of Procrustean rotation as a procedure for
assessing the invariance of study results is proposed. Researchers
have long relied on significance testing as a measure of judging the 
worthiness of empirical findings. However, significance testing has 
come under fire because it does not provide information about the 
importance or replicability of results. A major misconception is 
confusing statistical significance testing with reproducibility. 
Assessing the invariance of study results is a useful alternative. An 
example is given of an invariance technique following a discriminant 
analysis. The analysis was calculated from a hypothetical data set 
with 64 cases and two predictor variables. The rotation technique can
be used as a cross-validation procedure, splitting the data from a 
single sample and comparing the factor vectors from each half. A 
Procrustean rotation forces orthogonal (uncorrelated) functions or
factors to a "best fit" position after setting the factor vectors to 
unit length to equalize the contribution of each factor vector to the 
determination of the amount of rotation necessary. The RELATE 
computer program of D. J. Veldman was used for the necessary 
calculations. Four tables present data from the analysis. A 20-item 
list of references is included. (SLD) 

ABSTRACT

Researchers have long relied on significance testing as a measure of
judging the worthiness of empirical findings. However, in the last two
decades, significance testing has come under fire from prominent
researchers. Statistical significance testing does not provide information
about the importance or the replicability of results. A major misconception
is the confusing of statistical significance testing with reproducibility.
Thoughtful researchers have begun to place importance on replicability of
results. Perhaps one reason for the difficulty in "exorcising the null
hypothesis" is that researchers do not feel a suitable substitute has been
offered. The present paper offers one such alternative--assessing the
invariance of study results. The use of Procrustean rotation as an
invariance procedure is the focus of this paper. A concrete example is
provided.

Significance testing has long been the measure for judging the
worthiness of empirical findings. Adherence to this measure is based on the
rationale that "the larger two random samples are, the closer should be 
their means on any measure of interest, provided the samples are from the 
same population" (Fish, 1986, pp. 1-2). As Fish (1986) notes, "the logic 
of statistical significance testing is at first compelling, for it is 
based on [a] perfectly reasonable assumption" (p. 1). However, in the last 
two decades, significance testing has come under fire from prominent re- 
searchers such as Cronbach (1975), who asserts that "the time has arrived 
to exorcise the null hypothesis" (p. 124), and Shulman (1970), who main- 
tains that "the time has arrived for educational researchers to divest 
themselves of the yoke of statistical hypothesis testing" (p. 389). 

One reason for the growing disenchantment with statistical sig- 
nificance testing in some quarters is the strong effect that sample size 
has on the results of a statistical test of the null hypothesis. An ex- 
ample by Thompson (1989) clearly makes this point. Thompson establishes a 
fixed effect size of 33.6%, considered large in social science research. 
By using a sample size of 13 cases Thompson is able to create 
non-significant results, but by increasing his sample size to 23, he pro- 
duces statistical significance. Although Thompson in this example employs 
a large effect size, the same dynamic, i.e., different outcomes resulting
from adding or losing a few subjects, can occur at any sample size.
Carver (1978) confirms that "a mean difference that is small and not
significant from a research standpoint can be statistically significant just
because enough subjects were used in the experiment to make the result 
statistically rare under the null hypothesis" (p. 388). 
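
The dependence of the test statistic on sample size can be seen with a little arithmetic. The sketch below is only illustrative: it assumes a one-way design with a hypothetical number of groups (K = 3), because the design behind Thompson's example is not described here, and the function name is an assumption as well. The critical values are approximate entries from a standard F table at the .05 level. The effect size is held at 33.6% while only the number of cases changes.

```python
def f_statistic(eta_sq, n_total, k_groups):
    """F ratio implied by a fixed eta-squared in a one-way design."""
    df_between = k_groups - 1
    df_within = n_total - k_groups
    return (eta_sq / df_between) / ((1.0 - eta_sq) / df_within)


# Approximate tabled critical values of F at the .05 level,
# keyed by (df_between, df_within).
CRITICAL_F_05 = {(2, 10): 4.10, (2, 20): 3.49}

ETA_SQ = 0.336  # effect size quoted in the text
K = 3           # hypothetical number of groups (an assumption)

results = {}
for n in (13, 23):
    f_value = f_statistic(ETA_SQ, n, K)
    critical = CRITICAL_F_05[(K - 1, n - K)]
    # Store the rounded F ratio and whether it exceeds the critical value.
    results[n] = (round(f_value, 2), f_value > critical)
    print(n, results[n])
```

Under these assumptions the F ratio roughly doubles between 13 and 23 cases although the effect size is unchanged, which is the dynamic the paragraph describes.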

The problem with blind reliance on significance testing is that it
leads to misinterpretations of study findings. Suppose a researcher
obtains a result that is statistically significant at the p = .10 or α =
.15 level, but that with invariance testing would prove to be quite stable
under sampling and thus generalizable to the population of interest. Suppose
further that this researcher does not understand invariance testing and,
consequently, lets a noteworthy result go unpublished. This unfortunate
outcome would occur because an alpha level slightly less stringent than
the commonly accepted α = .05 would be allowed to overshadow the
importance of the generalizability of the findings. If science is the
business of cumulating knowledge, then generalizability of study results
warrants genuinely serious consideration.

The reverse is equally true. If this same researcher obtains a
result significant at the p = .01 level, but again fails to conduct
invariance testing and so fails to discover that this time the finding is
sample specific and not generalizable, the same flawed thinking would be
employed. Only this time the study would likely be written up and
published. The fact that the finding does not apply to the population would
likely go unnoticed, an unfortunate outcome of this scenario. Research
conducted in this manner does not add to a body of knowledge and does not
advance a field. To the contrary, a more likely result is that such
practices retard development of a field because important but statistically
nonsignificant findings are not included in the literature, while
trivial but statistically significant findings are.

There are critics of significance testing (Carver, 1978; Schneider &
Darcy, 1984) who would agree with Thompson (1988, p. 100) that "sig- 
nificance is not... the end-all and be-all of research." Nonetheless, sig- 
nificance remains a paramount concern in research, causing Rosnow and 
Rosenthal (1989, p. 1277) to chide that "surely, God loves the .06 nearly 
as much as the .05. Can there be any doubt that God views the strength of 
evidence for or against the null as a fairly continuous function of the 
magnitude of p?" The admonition of Rosnow and Rosenthal (1989)
notwithstanding, evidence indicates that such doubt does exist.

Carver (1978) notes that in 1977, despite rumblings against sig- 
nificance testing in the research community, only two of the 29 articles 
of empirical research published in the American Educational Research
Journal did not use significance testing. This finding led Carver (1978) to
assert "apparently the case against such testing will have to be stated 
more loudly and more clearly to a wider audience if it is to have any ef- 
fect" (p. 379). Twelve years later, in 1989, the present authors found 
that little had changed. Of 17 empirical articles published in the same 
journal, only one researcher found invariance testing important enough to 
warrant discussion, and only two others reported results that did not in- 
clude levels of statistical significance. 

Thompson (1987, 1988) reminds researchers that statistical sig- 
nificance testing does not provide information about the importance of re- 
sults. For this reason, other analogs must be consulted for an indication 
of result noteworthiness. Carver (1978) concurs with this view, arguing 
that a major misconception of researchers involves the confusion of 
statistical significance test results with findings regarding
reproducibility.

Thoughtful researchers have begun to place importance on result 
replicability, which, in the view of Carver (1978), "is the cornerstone of
science" (p. 392). This position on replicability is neither new nor 
novel. According to Tukey (1969), Sir Ronald Fisher, father of modern 
statistical testing, held the view that the "standard of firm knowledge 
was not one very extremely significant result, but rather the ability to 
repeatedly get results significant at 5%. Repetition is the basis for 
judging variability and significance and confidence. Repetition of re- 
sults, each significant, is the basis, according to Fisher, of scientific 
truth" (p. 85). 

Neale and Liebert (1986) corroborate the contention that replication 
is intrinsic to true scientific inquiry, stating "no one study, however 
shrewdly designed and carefully executed, can provide convincing support 
for a causal hypothesis or theoretical statement in the social sciences"
(p. 290). Perhaps one reason for the difficulty in "exorcising the null 
hypothesis" is that researchers do not feel a suitable substitute has been 
offered. The present paper offers one such alternative--assessing the
invariance of study results. 

Invariance analysis provides more confidence that research results 
are stable and replicable across samples. The current study applied an 
invariance technique following a discriminant analysis. For readers unfa- 
miliar with discriminant analysis, Huberty and Barton (1989) provide a 
very understandable explanation. Traditionally, four approaches have been 
used to assess the stability of discriminant function coefficients: (1) 
the "empirical" method, (2) the "holdout" ("cross validation," or "split
half") method, (3) the "Monte Carlo" method, and (4) the "random assign- 
ment" method. Daniel (1989) provides an explanation of each of these ap- 
proaches. Other examples of invariance procedures following a 
discriminant analysis are provided by Jones (1989). The use of the 
Procrustean rotation invariance procedure is the focus of this paper. A 
concrete example is provided. 

Heuristic Example 

For the present paper, a discriminant analysis was calculated from a 
hypothetical data set with 64 cases and two predictor variables, X and Y. 
The first 32 cases were from a data set developed by Fish (1988). Four 
groups of 16 cases each were derived. The data set and the SPSSx commands 
for the discriminant analysis are presented in Tables 1 and 2, so that the 
reader is able to replicate and further explore the analysis. 



INSERT TABLES 1 AND 2 ABOUT HERE 



Procrustean rotation can be used with any multivariate technique. 
The name is derived from Greek mythology. Procrustes, a son of Poseidon, 
forced travelers spending the night at his home to fit his bed by either 
cutting off their legs or stretching their bodies. Similarly, a
Procrustean rotation forces orthogonal (uncorrelated) functions or factors
to a "best fit" position after setting the factor vectors to unit length 
(1.0) "in order to equalize the contribution of each [factor vector] to 
the determination of the amount of rotation necessary" (Veldman, 1967, p. 
238). This rotation technique can be used as a cross-validation 
procedure, splitting the data from a single sample and comparing the
factor vectors from each half.

The use of factor analysis as a means of validity evaluation is well 
known. Thompson and Pitts (1981/1982) describe this cosine application as
a rotation of calculated factors to a position of "best fit" with a target 
matrix that has been theoretically derived. The target matrix determines 
how many factors are expected and the expected correlation between each 
item and each factor. Thus, the cosines of the angles between the actual 
and the hypothetical measures can be interpreted as validity coefficients. 

In the present study, by splitting the original sample into two sub- 
sets, discriminant functions were generated resulting in two sets of coef- 
ficients for comparison using the "best fit" rotation method. Thompson 
(1986) provides a detailed review of this empirical method developed by 
Kaiser, Hunka, and Bianchini (1969) for "relating" factors derived from 
different samples of data. This method consists of projecting the two sets 
of factors into the same factor space and calculating the cosines of the 
angles among the factors across the two solutions. These cosines provide 
a measure of the relatedness of the two sets of factors, and are similar 
to correlation coefficients. 
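
The projection and cosine computation just described can be written compactly. The notation below is an assumption introduced for illustration and is not taken from Kaiser, Hunka, and Bianchini or from Veldman: A and B hold the two sets of coefficients with each row rescaled to unit length, and T is the orthogonal transformation sought. The formulation shown is the standard least-squares (orthogonal Procrustes) statement of a "best fit" rotation, offered here as one plausible reading of the procedure.

```latex
% A, B : p x k coefficient matrices (p variables, k functions),
%        each row rescaled to unit length
% T    : k x k orthogonal matrix carrying B to "best fit" with A
\[
  \hat{T} \;=\; \arg\min_{T^{\top} T = I} \; \lVert B\,T - A \rVert_F^{2}
\]
% With the singular value decomposition  B^{\top} A = U \Sigma V^{\top}:
\[
  \hat{T} \;=\; U V^{\top}
\]
% Cosine of the angle between function i of B and function j of A:
\[
  \cos\theta_{ij} \;=\; \hat{T}_{ij}
\]
% Test r for variable v, where a_v and b_v are row v of A and of B:
\[
  r_v \;=\; a_v \,\bigl(b_v \hat{T}\bigr)^{\top}
\]
```

Because every row has unit length, each variable contributes equally to the fit, and each test r is itself a cosine bounded by 1.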

Thompson (1981, 1986) affirms that these coefficients in this appli- 
cation are analogous to test-retest coefficients and has called them 
invariance coefficients. They may also be utilized as adequacy coeffi- 
cients for substantive interpretations. In this application function co- 
efficients are submitted to a Procrustean rotation instead of structure 
coefficients, because the primary concern here is to investigate the 
similarity of the function equations used to produce function scores. 
Table 3 presents these two sets of discriminant functions for the Table 1
data. The Table 1 data contain the variable INVAR that was created to divide
the data into eight groups of eight cases each. For this invariance pro- 
cedure, odd groups were analyzed and their function coefficients used for 
Matrix A; Matrix B contained function coefficients from the analysis of 
the even groups. 
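
The odd/even split can be sketched in a few lines. This is a minimal illustration only: the paper does not show how the split was coded, so the record layout and the function name `split_by_invar` are assumptions. The seven rows used are cases whose values are fully legible in Table 1.

```python
def split_by_invar(cases):
    """Split case records into halves by odd versus even INVAR value."""
    odd_half = [c for c in cases if c["invar"] % 2 == 1]
    even_half = [c for c in cases if c["invar"] % 2 == 0]
    return odd_half, even_half


# A few legible rows from Table 1: (case, group, X, Y, INVAR).
rows = [
    (10, 1, 8, 6, 8),
    (14, 1, 8, 8, 3),
    (16, 1, 9, 9, 6),
    (17, 2, 1, 2, 8),
    (18, 2, 3, 3, 4),
    (50, 4, 1, 2, 3),
    (51, 4, 1, 1, 2),
]
fields = ("case", "group", "x", "y", "invar")
cases = [dict(zip(fields, row)) for row in rows]

# Odd INVAR groups feed the analysis behind Matrix A, even groups Matrix B.
half_a, half_b = split_by_invar(cases)
print([c["case"] for c in half_a])
print([c["case"] for c in half_b])
```

Each half would then be submitted to its own discriminant analysis to obtain the two sets of function coefficients.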



INSERT TABLE 3 ABOUT HERE 



Using the RELATE program by Veldman (1967), the standardized 
discriminant function coefficients from sample one are input as Matrix A
and those from sample two as Matrix B. The decision regarding which ma- 
trix is to be designated the target for "best fit" rotation is arbitrary. 
Although the main focus of interest is the resulting matrix of cosines, the
test r's should be consulted first. These evaluate the relation of the given
variables from the two data sets within the factor space, and must be
sufficiently large for rotation of the functions to "best fit" to be
appropriate. The results of the rotation indicate test r's of .9993 and
.9994 for the two variables, X and Y, respectively, indicating (since they
are large) that the two discriminant functions can be rotated in this manner.

With respect to the resulting cosines among the functions, generally, 
to be considered replicated, functions should have a cosine of roughly .8 
or higher (Thompson & Pitts, 1981/1982). Kaiser suggests .85 is reason- 
able and Gorsuch recommends results greater than .93 as exceptional; how- 
ever, Thompson (1986) presents empirically derived cutoffs as an alterna- 
tive to the theoretically derived cutoffs formulated by others. 

The cosines among the functions across the solutions for these data
are presented in Table 4. For this example, Matrix A, Function I has the 
"best fit" with Matrix B, Function II with a cosine of .8377; Matrix A, 
Function II has the best rotated fit with Matrix B, Function I with a co- 
sine of .8377. These cosines are marginal and might be more meaningful 
intuitively were this heuristic example a real substantive study. 
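
The rotation reported here can be approximated outside RELATE. The following is a minimal sketch, not Veldman's program: it solves the orthogonal best-fit problem with a singular value decomposition from NumPy, and the function names are hypothetical. One assumption needs to be stated loudly: the first Function II coefficient of Matrix B is entered as -0.20726, whereas Table 3 prints it without a sign. The negative sign is a guess, made because under that assumption the sketch should give cosines close in absolute value to those in Table 4 (.5462 and .8377) and test r's near .9993.

```python
import numpy as np


def unit_rows(matrix):
    """Rescale every row (one variable's coefficients) to unit length."""
    m = np.asarray(matrix, dtype=float)
    return m / np.linalg.norm(m, axis=1, keepdims=True)


def best_fit_rotation(target, other):
    """Orthogonal T minimizing ||other_n @ T - target_n||, plus test r's."""
    a = unit_rows(target)
    b = unit_rows(other)
    # Orthogonal Procrustes solution: with b.T @ a = U S Vt, T = U @ Vt.
    u, _, vt = np.linalg.svd(b.T @ a)
    t = u @ vt
    # Test r: cosine between each target row and the rotated row.
    test_r = np.sum(a * (b @ t), axis=1)
    return t, test_r


# Matrix A (target) as printed in Table 3.
matrix_a = [[0.80530, 1.53832],
            [1.54426, 0.79386]]
# Matrix B from Table 3; the minus sign on 0.20726 is an ASSUMPTION.
matrix_b = [[1.56496, -0.20726],
            [1.34221, 0.83097]]

cosines, test_r = best_fit_rotation(matrix_a, matrix_b)
print(np.round(np.abs(cosines), 4))  # absolute cosines among the functions
print(np.round(test_r, 4))           # test r for X and for Y
```

The transformation found this way includes a reflection, so one of its entries is negative; only absolute values are compared with Table 4.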



INSERT TABLE 4 ABOUT HERE 



Relatively little is known about the characteristics of the various 
invariance estimates (Jones, 1989). Because of this, Thompson (1984) sug- 
gests that researchers employ several strategies in order to obtain both 
upper and lower bound estimates of the degree of capitalization on sam- 
pling specificity. 

Summary 

Thompson (1986) affirms that "researchers have increasingly
recognized the critical nature of replication as the ultimate test of
scientific findings and some have argued that replicability should replace
significance testing as part of a new logic of truth testing" (p. 27). Even
though the interpretation of invariance results remains a subjective
judgment, this does not diminish the need for performing invariance
procedures as an evaluation of replicability or generalizability of analytic
results. Using two or more invariance procedures provides researchers an 
added measure of confidence regarding their results. 

The present paper has elaborated one alternative applicable with all 
multivariate methods, i.e., Procrustean rotation. The RELATE computer 

Table 1

Hypothetical Data Set

Case  Group   X   Y  INVAR      Case  Group   X   Y  INVAR

   1      1   4   2      5        48      3   8   5      2
   2      1   5   3      8        49      4   1   7      4
   3      1   4   4      2        50      4   1   2      3
   4      1   4   5      3        51      4   1   1      2
   5      1   3   4      4        52      4   2   2      8
   6      1   6   5      6        53      4   2   3      3
   7      1   5   6      7        54      4   2   3      1
   8      1   7   5      2        55      4   3   2      7
   9      1   6   6      1        56      4   3   3      4
  10      1   8   6      8        57      4   3   4      7
  11      1   7   6      1        58      4   4   5      6
  12      1   9   7      5        59      4   4   4      5
  13      1   8   7      4        60      4   4   5      4
  14      1   8   8      3        61      4   4   6      2
  15      1   9   8      7        62      4   5   6      1
  16      1   9   9      6        63      4   5   7      8
  17      2   1   2      8        64      4   5   7      6
  18      2   3   3      4

[Cases 19 through 47: the values for these rows are not legible in the
source reproduction.]

Table 2

SPSSx Commands for Discriminant Analysis

FILE HANDLE MT/NAME='DISCRIMNT.DAT'
DATA LIST FILE=MT/CASE 1-2 GROUP 7 X 12 Y 17
LIST VARIABLES CASE TO Y
DISCRIMINANT GROUPS=GROUP (1,4)
  /VARIABLES=X Y
  /STATISTICS=MEAN STDEV UNIVF RAW



Table 3

Matrices Entered Into Procrustean Rotation: Standardized Function
Coefficients of Each Split-Sample Discriminant Analysis Run

MATRIX A (TARGET MATRIX, n = 32):

Function I     Function II

   0.80530         1.53832
   1.54426         0.79386

MATRIX B (n = 32):

Function I     Function II

   1.56496         0.20726
   1.34221         0.83097

Table 4

Cosines Among Factor Axes Resulting From Procrustean
Rotation Invariance Procedure

A BY B         Function I     Function II

Function I         0.5462          0.8377
Function II        0.8377          0.5462



